-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
attempting to fix racy Akka.Persistence.TestKit.Tests #7068
base: dev
Are you sure you want to change the base?
attempting to fix racy Akka.Persistence.TestKit.Tests #7068
Conversation
Took ~5 seconds to complete recovery with the Akka.Persistence.TestKit journal? Something's off then. |
try | ||
{ | ||
var highSequenceNr = await _breaker.WithCircuitBreaker((message, readHighestSequenceNrFrom, awj: this), state => | ||
state.awj.ReadHighestSequenceNrAsync(state.message.PersistenceId, state.readHighestSequenceNrFrom)); | ||
state.awj.ReadHighestSequenceNrAsync(state.message.PersistenceId, state.readHighestSequenceNrFrom)) | ||
.ConfigureAwait(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if there's some issues with the Akka.Persistence Journal
capturing context here, since these tasks run as detatched. I doubt it's the issue but it's one of the things that stood out to me here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The task isn't detached though, its actually awaited inside the circuit breaker, except when the circuit breaker is open, in which it just throws an open circuit exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true - we have some other racy CircuitBreaker
specs too (independently from any Akka.Persistence plugin) - wonder if that's the common issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its not the CircuitBreaker, the giant lag on recovery is between PreStart
and when the LoadSnapshot
message was received by the snapshot store actor:
[INFO][02/01/2024 18:30:23.919Z][Thread 0017][PersistenceExtension (akka://test)] Auto-starting journal plugin `akka.persistence.journal.test`
[INFO][02/01/2024 18:30:23.940Z][Thread 0017][PersistenceExtension (akka://test)] Auto-starting snapshot store `akka.persistence.snapshot-store.test`
[INFO][02/01/2024 18:30:23.999Z][Thread 0005][akka://test/system/akka.persistence.journal.test] Using write interceptor Noop
[INFO][02/01/2024 18:30:24.041Z][Thread 0012][akka://test/user/$b] Starting up and beginning recovery
[INFO][02/01/2024 18:30:29.807Z][Thread 0012][akka://test/system/akka.persistence.snapshot-store.test] LoadSnapshot message received.
[INFO][02/01/2024 18:30:29.809Z][Thread 0012][akka://test/system/akka.persistence.snapshot-store.test] Starting LoadSnapshotAsync circuit breaker.
[INFO][02/01/2024 18:30:29.810Z][Thread 0012][akka://test/system/akka.persistence.snapshot-store.test] Invoking LoadAsync inside circuit breaker.
[INFO][02/01/2024 18:30:29.812Z][Thread 0012][akka://test/system/akka.persistence.snapshot-store.test] Starting to intercept snapshot foo loading using interceptor Noop
[INFO][02/01/2024 18:30:29.812Z][Thread 0012][akka://test/system/akka.persistence.snapshot-store.test] Completed intercept snapshot foo loading using interceptor Noop
[INFO][02/01/2024 18:30:29.814Z][Thread 0012][akka://test/system/akka.persistence.snapshot-store.test] LoadSnapshotAsync circuit breaker completed.
[DEBUG][02/01/2024 18:30:29.838Z][Thread 0012][akka://test/system/akka.persistence.journal.test/$a] Replay completed: RecoverySuccess<highestSequenceNr: 0>
[INFO][02/01/2024 18:30:29.839Z][Thread 0012][akka://test/user/$b] Received recover Akka.Persistence.RecoveryCompleted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me think that the bug is actually in the RecoveryPermitter
class
This reverts commit 30897ff.
Well this is puzzling, but reflects the problem using logging statements:
|
When that same test passes, the logs look like this:
|
|
So what is blocking the |
…h so that it wouldn't be masked by other exceptions
…heweb/akka.net into fix-AkkaPersistenceTestKitTests
Note that there are no async-await between the two log lines, only a simple code. The The |
Changing the actor dispatcher to "akka.actor.internal-dispatcher" eliminates the start-up delay, could it be that there's something wrong inside "akka.actor.default-dispatcher" implementation? |
We implemented this in #7117 - let's see how it does |
@Arkatufus looks to me like the dispatcher changes didn't have any impact |
Changes
Trying to fix a pair of specs with high flip rates in the Akka.NET test suite.
Checklist
For significant changes, please ensure that the following have been completed (delete if not relevant):